Using Domain Similarity for Performance Estimation
Authors
Abstract
Many natural language processing (NLP) tools perform worse when they are applied to data that is linguistically different from the corpus used during development. This makes it hard to develop NLP tools for domains for which annotated corpora are not available. This paper explores a number of metrics that attempt to predict the cross-domain performance of an NLP tool through statistical inference. We apply different similarity metrics to compare domains and investigate the correlation between similarity and the accuracy loss of an NLP tool. We find that the relationship between the performance of the tool and the similarity metric is linear and that the latter can therefore be used to predict the performance of an NLP tool on out-of-domain data. The approach also provides a way to quantify the difference between domains.
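The idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the similarity metrics used, so the example below assumes Jensen-Shannon divergence over unigram distributions as a stand-in; the function names (`unigram_dist`, `js_divergence`, `fit_line`, `predict_accuracy`) are hypothetical and not taken from the paper. It fits a straight line to (divergence, accuracy) pairs measured on domains where annotated data exists, then applies that line to a new domain.

```python
import math
from collections import Counter


def unigram_dist(tokens):
    """Relative frequency of each token in a corpus (list of strings)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2): 0 for identical distributions,
    1 for distributions with no shared vocabulary.
    Assumption: this metric is a stand-in, not necessarily one the paper uses."""
    div = 0.0
    for word in set(p) | set(q):
        pw, qw = p.get(word, 0.0), q.get(word, 0.0)
        m = 0.5 * (pw + qw)
        if pw > 0:
            div += 0.5 * pw * math.log2(pw / m)
        if qw > 0:
            div += 0.5 * qw * math.log2(qw / m)
    return div


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_accuracy(train_tokens, target_tokens, slope, intercept):
    """Estimate accuracy on an unannotated target domain from its divergence
    to the training corpus, using a previously fitted line."""
    d = js_divergence(unigram_dist(train_tokens), unigram_dist(target_tokens))
    return slope * d + intercept
```

In use, one would compute the divergence between the training corpus and each annotated test domain, record the tool's measured accuracy on each, pass both lists to `fit_line`, and then call `predict_accuracy` for a domain without annotations. The prediction is only as good as the linear fit, so it should be read as an estimate.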
Similar resources
Case Mix Planning using The Technique for Order of Preference by Similarity to Ideal Solution and Robust Estimation: a Case Study
Management of surgery units and operating rooms (OR) plays a key role in optimizing the utilization of hospitals. To this end, Case Mix Planning (CMP) is normally applied to long-term planning of OR. This refers to allocating OR time to each patient group. In this paper a mathematical model is applied to optimize the allocation of OR time among surgical groups. In addition, another technique is ...
Channel Effect Compensation in OFDM System under Short CP Length Using Adaptive Filter in Wavelet Transform Domain
Channel estimation in communication systems is one of the most important issues that can reduce the error rate of sending and receiving information as much as possible. In this regard, estimation of OFDM-based wireless channels using known sub-carriers as pilot is of particular importance in frequency domain. In this paper, channel estimation under short cyclic prefix (CP) in OFDM system is con...
Channel Estimation and CFO Compensation in OFDM System Using Adaptive Filters in Wavelet Transform Domain
Abstract In this paper, combined estimation of the channel, receiver frequency-dependent IQ imbalance, and carrier frequency offset under short cyclic prefix (CP) length is considered in an OFDM system. An adaptive algorithm based on the set-membership filtering (SMF) algorithm is used to compensate for these impairments. In short CP length, per-tone equalization (PTEQ) structure is used to avoid i...
Evaluation and Comparison of Topographic Correction Models Is Applied on the Series Landsat Images Using Spectrometery Data
The effect of topography on the radiance recorded in satellite images can reduce the accuracy of algorithms applied to the images. Therefore, to reduce the effect of topography, various correction models based on the interaction between light and object need to be defined. This research introduces the Lambertian correction model (cosine model) and non-Lambertian correction model (Minnaert and...
Determination of Stability Domains for Nonlinear Dynamical Systems Using the Weighted Residuals Method
Finding a suitable estimation of stability domain around stable equilibrium points is an important issue in the study of nonlinear dynamical systems. This paper intends to apply a set of analytical-numerical methods to estimate the region of attraction for autonomous nonlinear systems. In mechanical and structural engineering, autonomous systems could be found in large deformation problems or c...